53 research outputs found

    The sequence matters: A systematic literature review of using sequence analysis in Learning Analytics

    Describing and analysing sequences of learner actions is becoming more popular in learning analytics. Nevertheless, the authors found a variety of definitions of what a learning sequence is, which data are used for the analysis, and which methods are implemented, as well as of the purposes and educational interventions designed with them. In this literature review, the authors aim to generate an overview of these concepts in order to develop a decision framework for using sequence analysis in educational research. After analysing 44 articles, the conclusions enable us to highlight different learning tasks and educational settings where sequences are analysed, identify data mapping models for different types of sequence actions, differentiate methods based on purpose and scope, and identify possible educational interventions based on the outcomes of sequence analysis. Comment: Submitted to the Journal of Learning Analytics
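    To make the notion of a learning sequence concrete, a minimal sketch of first-order sequence analysis is shown below: learner actions are encoded as ordered symbols and transition frequencies are counted. The action labels and sessions are invented for illustration and do not come from the reviewed articles.

    ```python
    from collections import Counter

    # Hypothetical learner action sequences, one list per student session.
    sessions = [
        ["read", "quiz", "read", "video", "quiz"],
        ["video", "quiz", "quiz", "read"],
    ]

    def transition_counts(seq):
        """Count first-order transitions between consecutive actions."""
        return Counter(zip(seq, seq[1:]))

    counts = Counter()
    for session in sessions:
        counts.update(transition_counts(session))

    print(counts[("quiz", "read")])  # 2: 'quiz' is followed by 'read' twice overall
    ```

    Transition counts like these are the starting point for most sequence methods in the review, from simple frequency comparisons to Markov-chain and process-mining models.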

    Impact of annotation modality on label quality and model performance in the automatic assessment of laughter in-the-wild

    Laughter is considered one of the most overt signals of joy. Laughter is well recognized as a multimodal phenomenon, but is most commonly detected by sensing the sound of laughter. It is unclear how the perception and annotation of laughter differ when it is annotated from other modalities, such as video, via the body movements of laughter. In this paper, we take a first step in this direction by asking if and how well laughter can be annotated when only audio, only video (containing full body movement information), or audiovisual modalities are available to annotators. We ask whether annotations of laughter are congruent across modalities, and compare the effect that labeling modality has on machine learning model performance. We compare annotations and models for laughter detection, intensity estimation, and segmentation, three tasks common in previous studies of laughter. Our analysis of more than 4000 annotations acquired from 48 annotators revealed evidence for incongruity in the perception of laughter and its intensity between modalities. Further analysis of annotations against consolidated audiovisual reference annotations revealed that recall was lower on average for video than for the audio condition, but tended to increase with the intensity of the laughter samples. Our machine learning experiments compared the performance of state-of-the-art unimodal (audio-based, video-based, and acceleration-based) and multimodal models for different combinations of input modalities, training label modality, and testing label modality. Models with video and acceleration inputs had similar performance regardless of training label modality, suggesting that it may be entirely appropriate to train models for laughter detection from body movements using video-acquired labels, despite their lower inter-rater agreement.

    Context Cues For Classification Of Competitive And Collaborative Overlaps

    Being able to respond appropriately to users’ overlaps should be seen as one of the core competencies of incremental dialogue systems. At the same time, identifying whether an interlocutor wants to support or grab the turn is a task which comes naturally to humans, but has not yet been implemented in such systems. Motivated by this, we first investigate whether prosodic characteristics of speech in the vicinity of overlaps are significantly different from prosodic characteristics in the vicinity of non-overlapping speech. We then test the suitability of different context sizes, both preceding and following but excluding features of the overlap, for the automatic classification of collaborative and competitive overlaps. We also test whether the fusion of preceding and succeeding contexts improves the classification. Preliminary results indicate that the optimal context for classification of overlap lies at 0.2 seconds preceding the overlap and up to 0.3 seconds following it. We demonstrate that we are able to classify collaborative and competitive overlaps with a median accuracy of 63%.
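    The context-window setup described above can be sketched as follows. The frame rate, the synthetic pitch track, and the function name are assumptions for illustration, not the paper's actual feature pipeline; only the window sizes (0.2 s before, 0.3 s after, excluding the overlap itself) come from the abstract.

    ```python
    import numpy as np

    FRAME_RATE = 100  # hypothetical prosodic frame rate: one value per 10 ms

    def context_features(pitch, overlap_start, overlap_end, pre=0.2, post=0.3):
        """Mean pitch in the windows preceding and following an overlap,
        excluding the overlap itself (times in seconds)."""
        s = int(overlap_start * FRAME_RATE)
        e = int(overlap_end * FRAME_RATE)
        pre_win = pitch[max(0, s - int(pre * FRAME_RATE)):s]
        post_win = pitch[e:e + int(post * FRAME_RATE)]
        return np.array([pre_win.mean(), post_win.mean()])

    # 5 s synthetic pitch track rising from 100 Hz to 200 Hz
    pitch_track = np.linspace(100.0, 200.0, 500)
    feats = context_features(pitch_track, overlap_start=2.0, overlap_end=2.5)
    # feats could then feed a binary competitive/collaborative classifier
    ```

    Fusing preceding and succeeding contexts, as tested in the paper, would amount to concatenating both window summaries into one feature vector.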

    Who Will Get the Grant? A Multimodal Corpus for the Analysis of Conversational Behaviours in Group

    In the last couple of years, more and more multimodal corpora have been created. Recently, many of these corpora have also included RGB-D sensors' data. However, there is, to our knowledge, no publicly available corpus which combines accurate gaze-tracking and high-quality audio recording for group discussions of varying dynamics. With a corpus that fulfilled these needs, it would be possible to investigate higher-level constructs such as group involvement, individual engagement, or rapport, which all require multimodal feature extraction. In the following paper, we describe the design and recording of such a corpus and provide some illustrative examples of how it might be exploited in the study of group dynamics.

    The Influence of Syntactic Boundaries on Place Assimilation in German

    Oertel C, Windmann A. The Influence of Syntactic Boundaries on Place Assimilation in German. Presented at GLOW 33, Wroclaw, Poland.

    Modelling Engagement in Multi-Party Conversations: Data-Driven Approaches to Understanding Human-Human Communication Patterns for Use in Human-Robot Interactions

    The aim of this thesis is to study human-human interaction in order to provide virtual agents and robots with the capability to engage in multi-party conversations in a human-like manner. The focus lies on the modelling of conversational dynamics and the appropriate realization of multi-modal feedback behaviour. For such an undertaking, it is important to understand how human-human communication unfolds in varying contexts and constellations over time. To this end, multi-modal human-human corpora are designed, and annotation schemes to capture conversational dynamics are developed. Multi-modal analyses are carried out and models are built. Emphasis is put not on modelling speaker behaviour in general but on modelling listener behaviour in particular. In this thesis, a bridge is built between multi-modal modelling of conversational dynamics on the one hand and multi-modal generation of listener behaviour in virtual agents and robots on the other. In order to build this bridge, a unit-selection multi-modal synthesis of feedback is carried out, as well as a statistical speech synthesis of feedback. The effect of a variation in the prosody of feedback tokens on the perception of third-party observers is evaluated. Finally, the effect of a controlled variation of eye-gaze is evaluated, as is the perception of user feedback in human-robot interaction.

    A Gaze-based Method for Relating Group Involvement to Individual Engagement in Multimodal Multiparty Dialogue

    This paper is concerned with modelling individual engagement and group involvement, as well as their relationship, in an eight-party, multimodal corpus. We propose a number of features (presence, entropy, symmetry and maxgaze) that summarise different aspects of eye-gaze patterns and allow us to describe individual as well as group behaviour in time. We use these features to define similarities between the subjects, and we compare this information with the engagement rankings the subjects expressed at the end of each interaction about themselves and the other participants. We analyse how these features relate to four classes of group involvement and build a classifier that is able to distinguish between those classes with 71% accuracy.
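    As an illustration of one of the proposed feature types, a minimal sketch of a gaze-entropy computation over a window of per-frame gaze targets is given below. The target labels and window are invented, and the exact windowing and normalisation used in the paper may differ.

    ```python
    import math
    from collections import Counter

    def gaze_entropy(targets):
        """Shannon entropy (bits) of the gaze-target distribution in a window:
        0 when everyone's gaze target is the same, higher when gaze is spread."""
        counts = Counter(targets)
        n = len(targets)
        return sum(-(c / n) * math.log2(c / n) for c in counts.values())

    # Hypothetical per-frame gaze targets within an analysis window
    focused = ["A"] * 10               # gaze fixed on one participant
    spread = ["A", "B", "C", "D"] * 3  # gaze evenly spread over four people

    print(gaze_entropy(focused))  # 0.0
    print(gaze_entropy(spread))   # 2.0 (= log2 of 4 equally likely targets)
    ```

    Intuitively, low entropy indicates that the group's attention is concentrated on a single participant, which the paper relates to patterns of group involvement.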

    Context cues for classification of competitive and collaborative overlaps

    Oertel C, Wlodarczak M, Tarasov A, Campbell N, Wagner P. Context cues for classification of competitive and collaborative overlaps. In: Proceedings of Speech Prosody 2012. 2012: 721-724.

    Automatic Prominence Annotation of a German Speech Synthesis Corpus: Towards Prominence-Based Prosody Generation for Unit Selection Synthesis

    This paper describes work directed towards the development of a syllable prominence-based prosody generation functionality for a German unit selection speech synthesis system. A general concept for syllable prominence-based prosody generation in unit selection synthesis is proposed. As a first step towards its implementation, an automated syllable prominence annotation procedure based on acoustic analyses has been performed on the BOSS speech corpus. The prominence labeling has been evaluated against an existing annotation of lexical stress levels and manual prominence labeling on a subset of the corpus. We discuss methods and results and give an outlook on further implementation steps